Vector carrying a polynucleotide which encodes a polypeptide having the ability to stimulate an immune response against the polypeptide of Seq ID No:24

ABSTRACT

The invention provides a nucleotide sequence representing a pathogenicity island found in species of pathogenic mycobacteria. The islands are shown as SEQ ID NOS: 3 and 4 and comprise several open reading frames encoding polypeptides. These polypeptides and their use in diagnosis and therapy form a further aspect of the invention.

The present application is a continuation of U.S. application Ser. No. 10/805,311 (pending), filed Mar. 22, 2004, which is a divisional of U.S. application Ser. No. 09/705,911, filed Nov. 6, 2000 (abandoned), which is a divisional of 09/091,538, filed Sep. 16, 1998 (now U.S. Pat. No. 6,156,322), which is a 371 U.S. National Phase of PCT/GB96/03221, filed Dec. 23, 1996, which claims benefit of GB 09526178.0 filed Dec. 21, 1995, the entire contents of each of which is incorporated herein by reference.

This invention relates to the novel polynucleotide sequence we have designated “GS” which we have identified in pathogenic mycobacteria. GS is a pathogenicity island within 8 kb of DNA comprising a core region of 5.75 kb and an adjacent transmissible element within 2.25 kb. GS is contained within Mycobacterium paratuberculosis, Mycobacterium avium subsp. silvaticum and some pathogenic isolates of M. avium. Functional portions of the core region of GS are also represented by regions with a high degree of homology that we have identified in cosmids containing genomic DNA from Mycobacterium tuberculosis.

BACKGROUND TO THE INVENTION

Mycobacterium tuberculosis (Mtb) is a major cause of global diseases of humans as well as animals. Although conventional methods of diagnosis including microscopy, culture and skin testing exist for the recognition of these diseases, improved methods, particularly new immunodiagnostics and DNA-based detection systems, are needed. Drugs used to treat tuberculosis are increasingly encountering the problem of resistant organisms. New drugs targeted at specific pathogenicity determinants as well as new vaccines for the prevention and treatment of tuberculosis are required. The importance of Mtb as a global pathogen is reflected in the commitment being made to sequencing the entire genome of this organism. This has generated a large amount of DNA sequence data of genomic DNA within cosmid and other libraries. Although the DNA sequence is known in the art, the functions of the vast majority of these sequences, the proteins they encode, the biological significance of these proteins, and the overall relevance and use of these genes and their products as diagnostics, vaccines and targets for chemotherapy for tuberculous disease, remain entirely unknown.

Mycobacterium avium subsp. silvaticum (Mavs) is a pathogenic mycobacterium causing diseases of animals and birds, but it can also affect humans. Mycobacterium paratuberculosis (Mptb) causes chronic inflammation of the intestine in many species of animals including primates and can also cause Crohn's disease in humans. Mptb is associated with other chronic inflammatory diseases of humans such as sarcoidosis. Subclinical Mptb infection is widespread in domestic livestock and is present in milk from infected animals. The organism is more resistant to pasteurisation than Mtb and can be conveyed to humans in retail milk supplies. Mptb is also present in water supplies, particularly those contaminated with run-off from heavily grazed pastures. Mptb and Mavs contain the insertion elements IS900 and IS902 respectively, and these are linked to pathogenicity in these organisms. IS900 and IS902 provide convenient highly specific multi-copy DNA targets for the sensitive detection of these organisms using DNA-based methods and for the diagnosis of infections in animals and humans. Much improvement is however required in the immunodiagnosis of Mptb and Mavs infections in animals and humans. Mptb and Mavs are, in general, resistant in vivo to standard anti-tuberculous drugs. Although substantial clinical improvements in infections caused by Mptb, such as Crohn's disease, may result from treatment of patients with combinations of existing drugs such as Rifabutin, Clarithromycin or Azithromycin, additional effective drug treatments are required. Furthermore, there is an urgent need for effective vaccines for the prevention and treatment of Mptb and Mavs infections in animals and humans based upon the recognition of specific pathogenicity determinants.

Pathogenicity islands are, in general, 7-9 kb regions of DNA comprising a core domain with multiple ORFs and an adjacent transmissible element. The transmissible element also encodes proteins which may be linked to pathogenicity, such as by providing receptors for cellular recognition. Pathogenicity islands are envisaged as mobile packages of DNA which, when they enter an organism, assist in bringing about its conversion from a non-disease-causing to a disease-causing strain.

DESCRIPTION OF THE DRAWINGS

FIGS. 1(a) and (b) show a linear map of the pathogenicity island GS in Mavs (FIG. 1 a) and in Mptb (FIG. 1 b). The main open reading frames are illustrated as ORFs A to H. ORFs A to F are found within the core region of GS. ORFs G and H are encoded by the adjacent transmissible element portion of GS.

DISCLOSURE OF THE INVENTION

Using a DNA-based differential analysis technology we have discovered and characterised a novel polynucleotide in Mptb (isolates 0022 from a Guernsey cow and 0021 from a red deer). This polynucleotide comprises the gene region we have designated GS. GS is found in Mptb using the identifier DNA sequences Seq. ID. No 1 and 2 where the Seq. ID No 2 is the complementary sequence of Seq. ID No 1. GS is also identified in Mavs. The complete DNA sequence incorporating the positive strand of GS from an isolate of Mavs comprising 7995 nucleotides, including the core region of GS and adjacent transmissible element, is given in Seq. ID No. 3. DNA sequence comprising 4435 bp of the positive strand of GS obtained from an isolate of Mptb including the core region of GS (nucleotides 1614 to 6047 of GS in Mavs) is given in Seq. ID No 4. The DNA sequence of GS from Mptb is highly (99.4%) homologous to GS in Mavs. The remaining portion of the DNA sequence of GS in Mptb is readily obtainable by a person skilled in the art using standard laboratory procedures. The entire functional DNA sequence including core region and transmissible element of GS in Mptb and Mavs, as described above, comprises the polynucleotide sequences of the invention.

There are 8 open reading frames (ORFs) in GS. Six of these, designated GSA, GSB, GSC, GSD, GSE and GSF, are encoded by the core DNA region of GS which, characteristically for a pathogenicity island, has a different GC content than the rest of the microbial genome. Two ORFs designated GSG and GSH are encoded by the transmissible element of GS whose GC content resembles that of the rest of the mycobacterial genome. The ORF GSH comprises two sub-ORFs H₁ and H₂ on the complementary DNA strand linked by a programmed frameshifting site so that a single polypeptide is translated from the ORF GSH. The nucleotide sequences of the 8 ORFs in GS and their translations are shown in Seq. ID No 5 to Seq. ID No 29 as follows:

-   ORF A: Seq. ID No 5 Nucleotides 50 to 427 of GS from Mavs
    -   Seq. ID No 6 Amino acid sequence encoded by Seq. ID No 5.
-   ORF B: Seq. ID No 7 Nucleotides 772 to 1605 of GS from Mavs
    -   Seq. ID No 8 Amino acid sequence encoded by Seq. ID No 7.
-   ORF C: Seq. ID No 9 Nucleotides 1814 to 2845 of GS from Mavs
    -   Seq. ID No 10 Amino acid sequence encoded by Seq. ID No 9.
    -   Seq. ID No 11 Nucleotides 201 to 1232 of GS from Mptb
    -   Seq. ID No 12 Amino acid sequence encoded by Seq. ID No 11.
-   ORF D: Seq. ID No 13 Nucleotides 2785 to 3804 of GS from Mavs
    -   Seq. ID No 14 Amino acid sequence encoded by Seq. ID No 13.
    -   Seq. ID No 15 Nucleotides 1172 to 2191 of GS from Mptb
    -   Seq. ID No 16 Amino acid sequence encoded by Seq. ID No 15.
-   ORF E: Seq. ID No 17 Nucleotides 4080 to 4802 of GS from Mavs
    -   Seq. ID No 18 Amino acid sequence encoded by Seq. ID No 17.
    -   Seq. ID No 19 Nucleotides 2467 to 3189 of GS from Mptb
    -   Seq. ID No 20 Amino acid sequence encoded by Seq. ID No 19.
-   ORF F: Seq. ID No 21 Nucleotides 4947 to 5747 of GS from Mavs
    -   Seq. ID No 22 Amino acid sequence encoded by Seq. ID No 21.
    -   Seq. ID No 23 Nucleotides 3335 to 4135 of GS from Mptb
    -   Seq. ID No 24 Amino acid sequence encoded by Seq. ID No 23.
-   ORF G: Seq. ID No 25 Nucleotides 6176 to 7042 of GS from Mavs
    -   Seq. ID No 26 Amino acid sequence encoded by Seq. ID No 25.
-   ORF H: Seq. ID No 27 Nucleotides 7953 to 6215 from Mavs.
-   ORF H₁: Seq. ID No 28 Amino acid sequence encoded by nucleotides 7953 to 7006 of Seq. ID No 27.
-   ORF H₂: Seq. ID No 29 Amino acid sequence encoded by nucleotides 7009 to 6215 of Seq. ID No 27.
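By way of illustration only, the Mavs coordinates listed above can be held in a small table and used to cut each ORF out of the GS sequence. The following Python sketch is not part of the disclosed method; the names `ORFS_MAVS`, `orf_length` and `extract_orf` are hypothetical, and the sketch assumes that coordinates are 1-based and inclusive and that a start greater than the end denotes the complementary strand.

```python
# Illustrative sketch only. Coordinates are copied from the ORF list above
# (GS from Mavs, Seq. ID No 3). Assumption: 1-based, inclusive positions;
# start > end means the ORF lies on the complementary strand.
ORFS_MAVS = {
    "A": (50, 427),
    "B": (772, 1605),
    "C": (1814, 2845),
    "D": (2785, 3804),
    "E": (4080, 4802),
    "F": (4947, 5747),
    "G": (6176, 7042),
    "H1": (7953, 7006),
    "H2": (7009, 6215),
}

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def orf_length(start, end):
    """Number of nucleotides spanned by an inclusive coordinate pair."""
    return abs(end - start) + 1


def extract_orf(gs_sequence, start, end):
    """Cut an ORF out of the positive-strand sequence.

    Complementary-strand ORFs (start > end) are returned as the reverse
    complement so that they read 5' to 3' in the coding direction.
    """
    if start <= end:
        return gs_sequence[start - 1:end]
    return reverse_complement(gs_sequence[end - 1:start])


if __name__ == "__main__":
    for name, (start, end) in ORFS_MAVS.items():
        print(name, orf_length(start, end))
```

With these coordinates each listed span is a whole number of codons, and the spans given for H₁ and H₂ share the four nucleotides 7006 to 7009.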

The polynucleotides in Mtb with homology to the ORFs B, C, E and F of GS in Mptb and Mavs, and the polypeptides they are now known to encode as a result of our invention, are as follows:

-   ORF B: Seq. ID No 30 Cosmid MTCY277 nucleotides 35493 to 34705
    -   Seq. ID No 31 Amino acid sequence encoded by Seq. ID No 30.
-   ORF C: Seq. ID No 32 Cosmid MTCY277 nucleotides 31972 to 32994
    -   Seq. ID No 33 Amino acid sequence encoded by Seq. ID No 32.
-   ORF E: Seq. ID No 34 Cosmid MTCY277 nucleotides 34687 to 33956
    -   Seq. ID No 35 Amino acid sequence encoded by Seq. ID No 34.
-   ORF E: Seq. ID No 36 Cosmid MT024 nucleotides 15934 to 15203
    -   Seq. ID No 37 Amino acid sequence encoded by Seq. ID No 36.
-   ORF F: Seq. ID No 38 Cosmid MT024 nucleotides 15133 to 14306
    -   Seq. ID No 39 Amino acid sequence encoded by Seq. ID No 38.

The proteins and peptides encoded by the ORFs A to H in Mptb and Mavs and the amino acid sequences from homologous genes we have discovered in Mtb given in Seq. ID Nos 31, 33, 35, 37 and 39, as described above, and fragments thereof, comprise the polypeptides of the invention. The polypeptides of the invention are believed to be associated with specific immunoreactivity and with the pathogenicity of the host micro-organisms from which they were obtained.

The present invention thus provides a polynucleotide in substantially isolated form which is capable of selectively hybridising to sequence ID Nos 3 or 4 or a fragment thereof. The polynucleotide fragment may alternatively comprise a sequence selected from the group of Seq. ID. No: 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25 and 27. The invention further provides a polynucleotide in substantially isolated form whose sequence consists essentially of a sequence selected from the group Seq ID Nos. 30, 32, 34, 36 and 38, or a corresponding sequence selectively hybridizable thereto, or a fragment of said sequence or corresponding sequence.

The invention further provides diagnostic probes such as a probe which comprises a fragment of at least 15 nucleotides of a polynucleotide of the invention, or a peptide nucleic acid or similar synthetic sequence specific ligand, optionally carrying a revealing label. The invention also provides a vector carrying a polynucleotide as defined above, particularly an expression vector.

The invention further provides a polypeptide in substantially isolated form which comprises any one of the sequences selected from the group consisting of Seq. ID. No: 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 29, 31, 33, 35, 37 and 39, or a polypeptide substantially homologous thereto. The invention additionally provides a polypeptide fragment which comprises a fragment of a polypeptide defined above, said fragment comprising at least 10 amino acids and an epitope. The invention also provides polynucleotides in substantially isolated form which encode polypeptides of the invention, and vectors which comprise such polynucleotides, as well as antibodies capable of binding such polypeptides. In an additional aspect, the invention provides kits comprising polynucleotides, polypeptides, antibodies or synthetic ligands of the invention and methods of using such kits in diagnosing the presence or absence of mycobacteria in a sample. The invention also provides pharmaceutical compositions comprising polynucleotides of the invention, polypeptides of the invention or antisense probes and the use of such compositions in the treatment or prevention of diseases caused by mycobacteria. The invention also provides polynucleotihe prevention and treatment of infections due to GS-containing pathogenic mycobacteria in animals and humans and as a means of enhancing in vivo susceptibility of said mycobacteria to antimicrobial drugs. The invention also provides bacteria or viruses transformed with polynucleotides of the invention for use as vaccines. The invention further provides Mptb or Mavs in which all or part of the polynucleotides of the invention have been deleted or disabled to provide mutated organisms of lower pathogenicity for use as vaccines in animals and humans. The invention further provides Mtb in which all or part of the polynucleotides encoding polypeptides of the invention have been deleted or disabled to provide mutated organisms of lower pathogenicity for use as vaccines in animals and humans.

A further aspect of the invention is our discovery of homologies between the ORFs B, C and E in GS on the one hand, and Mtb cosmid MTCY277 on the other (data from Genbank database using the computer programmes BLAST and BLIXEM). The homologous ORFs in MTCY277 are adjacent to one another, consistent with the form of another pathogenicity island in Mtb. A further aspect of the invention is our discovery of homologies between ORFs E and F in GS, and Mtb cosmid MT024 (also Genbank, as above) with the homologous ORFs close to one another. The use of polynucleotides and polypeptides from Mtb (Seq. ID Nos 30, 31, 32, 33, 34, 35, 36, 37, 38 and 39) in substantially isolated form as diagnostics, vaccines and targets for chemotherapy, for the management and prevention of Mtb infections in humans and animals, and the processes involved in the preparation and use of these diagnostics, vaccines and new chemotherapeutic agents, comprise further aspects of the invention.

DETAILED DESCRIPTION OF THE INVENTION

A. Polynucleotides

Polynucleotides of the invention as defined herein may comprise DNA or RNA. They may also be polynucleotides which include within them synthetic or modified nucleotides or peptide nucleic acids. A number of different types of modification to oligonucleotides are known in the art. These include methylphosphonate and phosphorothioate backbones, and addition of acridine or polylysine chains at the 3′ and/or 5′ ends of the molecule. For the purposes of the present invention, it is to be understood that the polynucleotides described herein may be modified by any method available in the art. Such modifications may be carried out in order to couple the said polynucleotide to a solid phase or to enhance the recognition, the in vivo activity, or the lifespan of polynucleotides of the invention.

A number of different types of polynucleotides of the invention are envisaged. In the broadest aspect, polynucleotides and fragments thereof capable of hybridizing to SEQ ID NO:3 or 4 form a first aspect of the invention. This includes the polynucleotide of SEQ ID NO: 3 or 4. Within this class of polynucleotides various sub-classes of polynucleotides are of particular interest.

One sub-class of polynucleotides which is of interest is the class of polynucleotides encoding the open reading frames A, B, C, D, E, F, G and H, including SEQ ID NOs:5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25 and 27. As discussed below, polynucleotides encoding ORF H include the polynucleotide sequences 7953 to 7006 and 7009 to 6215 within SEQ ID NO: 27, as well as modified sequences in which the frame-shift has been modified so that the two sub-reading frames are placed in a single reading frame. This may be desirable where the polypeptide is to be produced in recombinant expression systems.

The invention thus provides a polynucleotide in substantially isolated form which encodes any one of these ORFs or combinations thereof. Combinations thereof include combinations of 2, 3, 4, 5 or all of the ORFs. Polynucleotides may be provided which comprise an individual ORF carried in a recombinant vector including the vectors described herein. Thus in one preferred aspect the invention provides a polynucleotide in substantially isolated form capable of selectively hybridizing to the nucleic acid comprising ORFs A to F of the core region of the Mptb and Mavs pathogenicity islands of the invention. Fragments thereof corresponding to ORFs A to E, B to F, A to D, B to E, A to C, B to D or any two adjacent ORFs are also included in the invention.

Polynucleotides of the invention will be capable of selectively hybridizing to the corresponding portion of the GS region, or to the corresponding ORFs of Mtb described herein. The term “selectively hybridizing” indicates that the polynucleotides will hybridize, under conditions of medium to high stringency (for example 0.03 M sodium chloride and 0.03 M sodium citrate at from about 50° C. to about 60° C.), to the corresponding portion of SEQ ID NO:3 or 4 or the complementary strands thereof but not to genomic DNA from mycobacteria which are usually non-pathogenic, including non-pathogenic species of M. avium. Such polynucleotides will generally be at least 68%, e.g. at least 70%, preferably at least 80 or 90% and more preferably at least 95% homologous to the corresponding DNA of GS. The corresponding portion will be over a region of at least 20, preferably at least 30, for instance at least 40, 60 or 100 or more contiguous nucleotides.

By “corresponding portion” it is meant a sequence from the GS region of the same or substantially similar size which has been determined, for example by computer alignment, to have the greatest degree of homology to the polynucleotide.

Any combination of the above mentioned degrees of homology and minimum sizes may be used to define polynucleotides of the invention, with the more stringent combinations (i.e. higher homology over longer lengths) being preferred. Thus for example a polynucleotide which is at least 80% homologous over 25, preferably 30 nucleotides forms one aspect of the invention, as does a polynucleotide which is at least 90% homologous over 40 nucleotides.
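As a rough illustration of how a homology threshold and a minimum length may be combined, the following Python sketch tests whether any ungapped window of a candidate sequence reaches a given percentage of matching positions against a reference. It is a simplification under stated assumptions: homology is treated here as simple positional identity, no gaps are allowed, and the function names `percent_identity` and `meets_threshold` are hypothetical. The computer alignment referred to above would normally be carried out with a dedicated alignment program.

```python
def percent_identity(a, b):
    """Percentage of positions at which two equal-length sequences match."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def meets_threshold(candidate, reference, min_identity=80.0, min_length=25):
    """Return True if some ungapped window of min_length nucleotides in the
    candidate matches some window of the reference at min_identity or more.

    Assumption: exhaustive ungapped comparison, adequate only for short
    sequences; it is not a substitute for a proper alignment.
    """
    for i in range(len(candidate) - min_length + 1):
        window = candidate[i:i + min_length]
        for j in range(len(reference) - min_length + 1):
            if percent_identity(window, reference[j:j + min_length]) >= min_identity:
                return True
    return False
```

For example, `meets_threshold(seq, gs_region, 90.0, 40)` corresponds to the "at least 90% homologous over 40 nucleotides" combination, where `seq` and `gs_region` are placeholder variables supplied by the user.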

A further class of polynucleotides of the invention is the class of polynucleotides encoding polypeptides of the invention, the polypeptides of the invention being defined in section B below. Due to the redundancy of the genetic code as such, polynucleotides may be of a lower degree of homology than required for selective hybridization to the GS region. However, when such polynucleotides encode polypeptides of the invention these polynucleotides form a further aspect. It may for example be desirable where polypeptides of the invention are produced recombinantly to increase the GC content of such polynucleotides. This increase in GC content may result in higher levels of expression via codon usage more appropriate to the host cell in which recombinant expression is taking place.
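The effect of silent codon changes on GC content can be sketched as follows. This Python example is illustrative only: the function names are hypothetical, and the small substitution table `GC_RICH_SYNONYM` is an assumed, deliberately incomplete set of synonymous codon pairs from the standard genetic code, not a codon-usage table for any particular host cell.

```python
# Assumed, incomplete table of synonymous substitutions (standard genetic
# code): each pair encodes the same amino acid, the replacement being
# richer in G and C.
GC_RICH_SYNONYM = {
    "TTA": "CTG",  # Leu
    "GCT": "GCC",  # Ala
    "AAA": "AAG",  # Lys
    "GGT": "GGC",  # Gly
    "GAA": "GAG",  # Glu
}


def gc_content(seq):
    """Percentage of G and C nucleotides in a DNA string."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def raise_gc(coding_seq):
    """Replace codons with GC-richer synonyms, leaving the encoded
    polypeptide unchanged. The input must be in frame."""
    if len(coding_seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    seq = coding_seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return "".join(GC_RICH_SYNONYM.get(codon, codon) for codon in codons)
```

In practice the choice of replacement codons would be guided by the codon preferences of the chosen host cell rather than by GC content alone.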

An additional class of polynucleotides of the invention are those obtainable from cosmids MTCY277 and MT024 (containing Mtb genomic sequences), which polynucleotides consist essentially of the fragment of the cosmid containing an open reading frame encoding any one of the homologous ORFs B, C, E or F respectively. Such polynucleotides are referred to below as Mtb polynucleotides. However, where reference is made to polynucleotides in general such reference includes Mtb polynucleotides unless the context is explicitly to the contrary. In addition, the invention provides polynucleotides which encode the same polypeptide as the abovementioned ORFs of Mtb but which, due to the redundancy of the genetic code, have different nucleotide sequences. These form further Mtb polynucleotides of the invention. Fragments of Mtb polynucleotides suitable for use as probes or primers also form a further aspect of the invention.

The invention further provides polynucleotides in substantially isolated form capable of selectively hybridizing (where selectively hybridizing is as defined above) to the Mtb polynucleotides of the invention.

The invention further provides the Mtb polynucleotides of the invention linked, at the 5′ and/or 3′ end, to polynucleotide sequences to which they are not naturally contiguous. Such sequences will typically be sequences found in cloning or expression vectors, such as promoters, 5′ untranslated sequence, 3′ untranslated sequence or termination sequences. The sequences may also include further coding sequences such as signal sequences used in recombinant production of proteins.

Further polynucleotides of the invention are illustrated in the accompanying examples.

Polynucleotides of the invention may be used to produce a primer, e.g. a PCR primer, a primer for an alternative amplification reaction, a probe e.g. labelled with a revealing label by conventional means using radioactive or non-radioactive labels or a probe linked covalently to a solid phase, or the polynucleotides may be cloned into vectors. Such primers, probes and other fragments will be at least 15, preferably at least 20, for example at least 25, 30 or 40 or more nucleotides in length, and are also encompassed by the term polynucleotides of the invention as used herein.

Primers of the invention which are preferred include primers directed to any part of the ORFs defined herein. The ORFs from other isolates of pathogenic mycobacteria which contain a GS region may be determined and conserved regions within each individual ORF may be identified. Primers directed to such conserved regions form a further preferred aspect of the invention. In addition, the primers and other polynucleotides of the invention may be used to identify, obtain and isolate ORFs capable of selectively hybridizing to the polynucleotides of the invention which are present in pathogenic mycobacteria but which are not part of a pathogenicity island in that particular species of bacteria. Thus in addition to the ORFs B, C, E and F which have been identified in Mtb, similar ORFs may be identified in other pathogens and ORFs corresponding to the GS ORFs C, D, E, F and H may also be identified.

Polynucleotides such as DNA polynucleotides and probes according to the invention may be produced recombinantly, synthetically, or by any means available to those of skill in the art. They may also be cloned by standard techniques.

In general, primers will be produced by synthetic means, involving a step-wise manufacture of the desired nucleic acid sequence one nucleotide at a time. Automated techniques for accomplishing this are readily available in the art. Longer polynucleotides will generally be produced using recombinant means, for example using PCR (polymerase chain reaction) cloning techniques. This will involve making a pair of primers (e.g. of about 15-30 nucleotides) to a region of GS which it is desired to clone, bringing the primers into contact with genomic DNA from a mycobacterium or a vector carrying the GS sequence, performing a polymerase chain reaction under conditions which bring about amplification of the desired region, isolating the amplified fragment (e.g. by purifying the reaction mixture on an agarose gel) and recovering the amplified DNA. The primers may be designed to contain suitable restriction enzyme recognition sites so that the amplified DNA can be cloned into a suitable cloning vector.
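The primer design step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a validated design procedure: the function `design_primers` is hypothetical, the EcoRI (GAATTC) and BamHI (GGATCC) recognition sites are chosen merely as examples of restriction sites, the two-base 5′ clamp is an assumption, and no check is made of melting temperature, secondary structure or the presence of the same sites within the target.

```python
ECORI = "GAATTC"  # EcoRI recognition site (example choice)
BAMHI = "GGATCC"  # BamHI recognition site (example choice)

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_primers(target, primer_len=20, fwd_site=ECORI, rev_site=BAMHI,
                   clamp="GC"):
    """Return (forward, reverse) primers, written 5' to 3', flanking target.

    The forward primer anneals to the complementary strand at the start of
    the target; the reverse primer is the reverse complement of the end of
    the target. Each carries a restriction site and an assumed short clamp
    at its 5' end.
    """
    if len(target) < 2 * primer_len:
        raise ValueError("target is too short for two non-overlapping primers")
    forward = clamp + fwd_site + target[:primer_len]
    reverse = clamp + rev_site + reverse_complement(target[-primer_len:])
    return forward, reverse
```

Here `target` stands for the region of GS which it is desired to clone; the default `primer_len` of 20 lies within the range of about 15-30 nucleotides mentioned above.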

Such techniques may be used to obtain all or part of the GS or ORF sequences described herein, as well as further genomic clones containing full open reading frames. Although in general such techniques are well known in the art, reference may be made in particular to Sambrook J., Fritsch E F., Maniatis T (1989). Molecular cloning: a Laboratory Manual, 2nd edn. Cold Spring Harbor, N.Y., Cold Spring Harbor Laboratory.

Polynucleotides which are not 100% homologous to the sequences of the present invention but fall within the scope of the invention can be obtained in a number of ways.

Other isolates or strains of pathogenic mycobacteria will be expected to contain allelic variants of the GS sequences described herein, and these may be obtained for example by probing genomic DNA libraries made from such isolates or strains of bacteria using GS or ORF sequences as probes under conditions of medium to high stringency (for example 0.03M sodium chloride and 0.03M sodium citrate at from about 50° C. to about 60° C.).

A particularly preferred group of pathogenic mycobacteria are isolates of M. paratuberculosis. Polynucleotides based on GS regions from such bacteria are particularly preferred. Preferred fragments of such regions include fragments encoding individual open reading frames including the preferred groups and combinations of open reading frames discussed above.

Alternatively, such polynucleotides may be obtained by site directed mutagenesis of the GS or ORF sequences or allelic variants thereof. This may be useful where for example silent codon changes are required to sequences to optimise codon preferences for a particular host cell in which the polynucleotide sequences are being expressed. Other sequence changes may be desired in order to introduce restriction enzyme recognition sites, or to alter the property or function of the polypeptides encoded by the polynucleotides of the invention. Such altered property or function will include the addition of amino acid sequences of consensus signal peptides known in the art to effect transport and secretion of the modified polypeptide of the invention. Another altered property will include mutagenesis of a catalytic residue or generation of fusion proteins with another polypeptide. Such fusion proteins may be with an enzyme, with an antibody or with a cytokine or other ligand for a receptor, to target a polypeptide of the invention to a specific cell type in vitro or in vivo.

The invention further provides double stranded polynucleotides comprising a polynucleotide of the invention and its complement.

Polynucleotides or primers of the invention may carry a revealing label. Suitable labels include radioisotopes such as ³²P or ³⁵S, enzyme labels, other protein labels or smaller labels such as biotin or fluorophores. Such labels may be added to polynucleotides or primers of the invention and may be detected using techniques known per se.

Polynucleotides or primers of the invention or fragments thereof, labelled or unlabelled, may be used by a person skilled in the art in nucleic acid-based tests for the presence or absence of Mptb, Mavs, other GS-containing pathogenic mycobacteria, or Mtb applied to samples of body fluids, tissues, or excreta from animals and humans, as well as to food and environmental samples such as river or ground water and domestic water supplies.

Human and animal body fluids include sputum, blood, serum, plasma, saliva, milk, urine, CSF, semen, faeces and infected discharges. Tissues include intestine, mouth ulcers, skin, lymph nodes, spleen, lung and liver obtained surgically or by a biopsy technique. Animals particularly include commercial livestock such as cattle, sheep, goats, deer and rabbits, but wild animals and animals in zoos may also be tested.

Such tests comprise bringing a human or animal body fluid or tissue extract, or an extract of an environmental or food sample, into contact with a probe comprising a polynucleotide or primer of the invention under hybridising conditions and detecting any duplex formed between the probe and nucleic acid in the sample. Such detection may be achieved using techniques such as PCR or by immobilising the probe on a solid support, removing nucleic acid in the sample which is not hybridized to the probe, and then detecting nucleic acid which has hybridized to the probe. Alternatively, the sample nucleic acid may be immobilized on a solid support, and the amount of probe bound to such a support can be detected. Suitable assay methods of this and other formats can be found in for example WO89/03891 and WO90/13667.

Polynucleotides of the invention or fragments thereof, labelled or unlabelled, may also be used to identify and characterise different strains of Mptb, Mavs, other GS-containing pathogenic mycobacteria, or Mtb, and properties such as drug resistance or susceptibility.

The probes of the invention may conveniently be packaged in the form of a test kit in a suitable container. In such kits the probe may be bound to a solid support where the assay format for which the kit is designed requires such binding. The kit may also contain suitable reagents for treating the sample to be probed, hybridising the probe to nucleic acid in the sample, control reagents, instructions, and the like.

The use of polynucleotides of the invention in the diagnosis of inflammatory diseases such as Crohn's disease or sarcoidosis in humans or Johne's disease in animals forms a preferred aspect of the invention. The polynucleotides may also be used in the prognosis of these diseases. For example, the response of a human or animal subject to antibiotic, vaccination or other therapies may be monitored by utilizing the diagnostic methods of the invention over the course of a period of treatment and following such treatment.

The use of Mtb polynucleotides (particularly in the form of probes and primers) of the invention in the above-described methods forms a further aspect of the invention, particularly for the detection, diagnosis or prognosis of Mtb infections.

B. Polypeptides

Polypeptides of the invention include polypeptides in substantially isolated form encoded by GS. This includes the full length polypeptides encoded by the positive and complementary negative strands of GS. Each of the full length polypeptides will contain one of the amino acid sequences set out in Seq ID NOs:6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28 and 29. Polypeptides of the invention further include variants of such sequences, including naturally occurring allelic variants and synthetic variants which are substantially homologous to said polypeptides. In this context, substantial homology is regarded as a sequence which has at least 70%, e.g. 80%, 90%, 95% or 98% amino acid homology (identity) over 30 or more, e.g. 40, 50 or 100 amino acids. For example, one group of substantially homologous polypeptides are those which have at least 95% amino acid identity to a polypeptide of any one of Seq ID NOs:6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28 and 29 over their entire length. Even more preferably, this homology is 98%.

Polypeptides of the invention further include the polypeptide sequences of the homologous ORFs of Mtb, namely Seq ID Nos. 31, 33, 35, 37 and 39. Unless explicitly specified to the contrary, reference to polypeptides of the invention and their fragments includes these Mtb polypeptides and fragments, and variants thereof (substantially homologous to said sequences) as defined herein.

Polypeptides of the invention may be obtained by the standard techniques mentioned above. Polypeptides of the invention also include fragments of the above mentioned full length polypeptides and variants thereof, including fragments of the sequences set out in SEQ ID NOs:6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 29, 31, 33, 35, 37 and 39. Such fragments, for example of 8, 10, 12, 15 or up to 30 or 40 amino acids, may also be obtained synthetically using standard techniques known in the art.

Preferred fragments include those which include an epitope, especially an epitope which is specific to the pathogenicity of the mycobacterial cell from which the polypeptide is derived. Suitable fragments will be at least about 5, e.g. 8, 10, 12, 15 or 20 amino acids in size, or larger. Epitopes may be determined by techniques such as peptide scanning techniques as described by Geysen et al, Mol. Immunol., 23; 709-715 (1986), as well as by other techniques known in the art.
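Peptide scanning of the kind referred to above generally starts from a set of overlapping peptides covering the polypeptide of interest. The following Python sketch merely enumerates such peptides; the function name `overlapping_peptides`, the default window of 8 amino acids (one of the fragment sizes mentioned above) and the default offset of one residue are assumptions made for illustration and do not reproduce any particular published protocol.

```python
def overlapping_peptides(protein, length=8, step=1):
    """List overlapping peptides covering a polypeptide sequence.

    Returns (start_position, peptide) tuples, with start_position given
    as a 1-based residue number. A protein shorter than `length` yields
    an empty list.
    """
    if length < 1 or step < 1:
        raise ValueError("length and step must be positive")
    return [
        (i + 1, protein[i:i + length])
        for i in range(0, len(protein) - length + 1, step)
    ]
```

Each peptide so listed could then be synthesised and tested for antibody binding by the skilled person; a larger `step` reduces the number of peptides at the cost of coarser epitope mapping.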

The term “an epitope which is specific to the pathogenicity of the mycobacterial cell” means that the epitope is encoded by a portion of the GS region, or by the corresponding ORF sequences of Mtb, which can be used to distinguish mycobacteria which are pathogenic from related non-pathogenic mycobacteria including non-pathogenic species of M. avium. This may be determined using routine methodology. A candidate epitope from an ORF may be prepared and used to immunise an animal such as a rat or rabbit in order to generate antibodies. The antibodies may then be used to detect the presence of the epitope in pathogenic mycobacteria and to confirm that non-pathogenic mycobacteria do not contain any proteins which react with the epitope. Epitopes may be linear or conformational.

Polypeptides of the invention may be in a substantially isolated form. It will be understood that the polypeptide may be mixed with carriers or diluents which will not interfere with the intended purpose of the polypeptide and still be regarded as substantially isolated. A polypeptide of the invention may also be in a substantially purified form, in which case it will generally comprise the polypeptide in a preparation in which more than 90%, e.g. 95%, 98% or 99% of the polypeptide in the preparation is a polypeptide of the invention.

Polypeptides of the invention may be modified to confer a desired property or function, for example by the addition of histidine residues to assist their purification or by the addition of a signal sequence to promote their secretion from a cell.

Thus, polypeptides of the invention include fusion proteins which comprise a polypeptide encoded by all or part of one or more of the ORFs of the invention fused at the N- or C-terminus to a second sequence to provide the desired property or function. Sequences which promote secretion from a cell include, for example, the yeast α-factor signal sequence.

A polypeptide of the invention may be labelled with a revealing label. The revealing label may be any suitable label which allows the polypeptide to be detected. Suitable labels include radioisotopes, e.g. ¹²⁵I, ³⁵S, enzymes, antibodies, polynucleotides and ligands such as biotin. Labelled polypeptides of the invention may be used in diagnostic procedures such as immunoassays in order to determine the amount of a polypeptide of the invention in a sample. Polypeptides or labelled polypeptides of the invention may also be used in serological or cell mediated immune assays for the detection of immune reactivity to said polypeptides in animals and humans using standard protocols.

A polypeptide or labelled polypeptide of the invention or fragment thereof may also be fixed to a solid phase, for example the surface of an immunoassay well, microparticle, dipstick or biosensor. Such labelled and/or immobilized polypeptides may be packaged into kits in a suitable container along with suitable reagents, controls, instructions and the like.

Such polypeptides and kits may be used in methods of detection of antibodies or cell mediated immunoreactivity to the mycobacterial proteins and peptides encoded by the ORFs of the invention and their allelic variants and fragments, using immunoassay. Such host antibodies or cell mediated immune reactivity will occur in humans or animals with an immune system which detects and reacts against polypeptides of the invention. The antibodies may be present in a biological sample from such humans or animals, where the biological sample may be a sample as defined above, particularly blood, milk or saliva.

Immunoassay methods are well known in the art and will generally comprise:

-   (a) providing a polypeptide of the invention comprising an epitope bindable by an antibody against said mycobacterial polypeptide;
-   (b) incubating a biological sample with said polypeptide under conditions which allow for the formation of an antibody-antigen complex; and
-   (c) determining whether antibody-antigen complex comprising said polypeptide is formed.

Immunoassay methods for cell mediated immune reactivity in animals and humans are also well known in the art (e.g. as described by Weir et al 1994, J. Immunol Methods 176; 93-101) and will generally comprise:

-   (a) providing a polypeptide of the invention comprising an epitope bindable by a lymphocyte or macrophage or other cell receptor;
-   (b) incubating a cell sample with said polypeptide under conditions which allow for a cellular immune response such as release of cytokines or other mediator to occur; and
-   (c) detecting the presence of said cytokine or mediator in the incubate.

Polypeptides of the invention may be made by standard synthetic means well known in the art or recombinantly, as described below.

Polypeptides of the invention or fragments thereof, labelled or unlabelled, may also be used to identify and characterise different strains of Mptb, Mavs, other GS-containing pathogenic mycobacteria, or Mtb, and properties such as drug resistance or susceptibility.

The polypeptides of the invention may conveniently be packaged in the form of a test kit in a suitable container. In such kits the polypeptide may be bound to a solid support where the assay format for which the kit is designed requires such binding. The kit may also contain suitable reagents for treating the sample to be examined, control reagents, instructions, and the like.

The use of polypeptides of the invention in the diagnosis of inflammatory diseases such as Crohn's disease or sarcoidosis in humans or Johne's disease in animals forms a preferred aspect of the invention. The polypeptides may also be used in the prognosis of these diseases. For example, the response of a human or animal subject to antibiotic or other therapies may be monitored by utilizing the diagnostic methods of the invention over the course of a period of treatment and following such treatment.

The use of Mtb polypeptides of the invention in the above-described methods forms a further aspect of the invention, particularly for the detection, diagnosis or prognosis of Mtb infections.

Polypeptides of the invention may also be used in assay methods for identifying candidate chemical compounds which will be useful in inhibiting, binding to or disrupting the function of said polypeptides required for pathogenicity. In general, such assays involve bringing the polypeptide into contact with a candidate inhibitor compound and observing the ability of the compound to disrupt, bind to or interfere with the polypeptide.

There are a number of ways in which the assay may be formatted. For example, those polypeptides which have an enzymatic function may be assayed using labelled substrates for the enzyme, and the amount of, or rate of, conversion of the substrate into a product measured, e.g. by chromatography such as HPLC or by a colourimetric assay. Suitable labels include ³⁵S, ¹²⁵I, biotin or enzymes such as horse radish peroxidase.

For example, the gene product of ORF C is believed to have GDP-mannose dehydratase activity. Thus an assay for inhibitors of the gene product may utilise, for example, labelled GDP-mannose, GDP or mannose, and the activity of the gene product followed. ORF D encodes a gene product related to the synthesis and regulation of capsular polysaccharides, which are often associated with invasiveness and pathogenicity. Labelled polysaccharide substrates may be used in assays of the ORF D gene product. ORF F encodes a protein with putative glucosyl transferase activity and thus labelled amino sugars such as β-1-3-N-acetylglucosamine may be used as substrates in assays.

Candidate chemical compounds which may be used may be natural or synthetic chemical compounds used in drug screening programmes. Extracts of plants which contain several characterised or uncharacterised components may also be used.

Alternatively, a polypeptide of the invention may be screened against a panel of peptides, nucleic acids or other chemical functionalities which are generated by combinatorial chemistry. This will allow the definition of chemical entities which bind to polypeptides of the invention. Typically, the polypeptide of the invention will be brought into contact with a panel of compounds from a combinatorial library, with either the panel or the polypeptide being immobilized on a solid phase, under conditions suitable for the polypeptide to bind to the panel. The solid phase will then be washed under conditions in which only specific interactions between the polypeptide and individual members of the panel are retained, and those specific members may be utilized in further assays or used to design further panels of candidate compounds.

A number of assay methods to define peptide interaction with peptides are known. For example, WO86/00991 describes a method for determining mimotopes which comprises making panels of catamer preparations, for example octamers of amino acids, at which one or more of the positions is defined and the remaining positions are randomly made up of other amino acids, determining which catamer binds to a protein of interest and re-screening the protein of interest against a further panel based on the most reactive catamer in which one or more additional designated positions are systematically varied. This may be repeated throughout a number of cycles and used to build up a sequence of a binding candidate compound of interest.

WO89/03430 describes screening methods which permit the preparation of specific mimotopes which mimic the immunological activity of a desired analyte. These mimotopes are identified by reacting a panel of individual peptides, wherein said peptides are of systematically varying hydrophobicity, amphipathic characteristics and charge patterns, using an antibody against an antigen of interest. Thus in the present case antibodies against a polypeptide of the invention may be employed and mimotope peptides from such panels may be identified.

C. Vectors.

Polynucleotides of the invention can be incorporated into a recombinant replicable vector. The vector may be used to replicate the nucleic acid in a compatible host cell. Thus in a further embodiment, the invention provides a method of making polynucleotides of the invention by introducing a polynucleotide of the invention into a replicable vector, introducing the vector into a compatible host cell, and growing the host cell under conditions which bring about replication of the vector. The vector may be recovered from the host cell. Suitable host cells are described below in connection with expression vectors.

D. Expression Vectors.

Preferably, a polynucleotide of the invention in a vector is operably linked to a control sequence which is capable of providing for the expression of the coding sequence by the host cell, i.e. the vector is an expression vector. The term “operably linked” refers to a juxtaposition wherein the components described are in a relationship permitting them to function in their intended manner. A control sequence “operably linked” to a coding sequence is ligated in such a way that expression of the coding sequence is achieved under conditions compatible with the control sequences. Such vectors may be transformed into a suitable host cell as described above to provide for expression of a polypeptide of the invention. Thus, in a further aspect the invention provides a process for preparing polypeptides according to the invention which comprises cultivating a host cell transformed or transfected with an expression vector as described above, under conditions to provide for expression by the vector of a coding sequence encoding the polypeptides, and recovering the expressed polypeptides.

A further embodiment of the invention provides vectors for the replication and expression of polynucleotides of the invention, or fragments thereof. The vectors may be, for example, plasmid, virus or phage vectors provided with an origin of replication, optionally a promoter for the expression of the said polynucleotide and optionally a regulator of the promoter. The vectors may contain one or more selectable marker genes, for example an ampicillin resistance gene in the case of a bacterial plasmid or a neomycin resistance gene for a mammalian vector. Vectors may be used in vitro, for example for the production of RNA, or used to transfect or transform a host cell. The vector may also be adapted to be used in vivo, for example in a method of naked DNA vaccination or gene therapy. A further embodiment of the invention provides host cells transformed or transfected with the vectors for the replication and expression of polynucleotides of the invention, including the DNA of GS, the open reading frames thereof and other corresponding ORFs, particularly ORFs B, C, E and F from Mtb. The cells will be chosen to be compatible with the said vector and may for example be bacterial, yeast, insect or mammalian.

Expression vectors are widely available in the art and can be obtained commercially. Mammalian expression vectors may comprise a mammalian or viral promoter. Mammalian promoters include the metallothionein promoter. Viral promoters include promoters from adenovirus, the SV40 large T promoter and retroviral LTR promoters. Promoters compatible with insect cells include the polyhedrin promoter. Yeast promoters include the alcohol dehydrogenase promoter. Bacterial promoters include the β-galactosidase promoter.

The expression vectors may also comprise enhancers, and in the case of eukaryotic vectors a polyadenylation signal sequence downstream of the coding sequence being expressed.

Polypeptides of the invention may be expressed in suitable host cells, for example bacterial, yeast, plant, insect and mammalian cells, and recovered using standard purification techniques including, for example, affinity chromatography, HPLC or other chromatographic separation techniques.

Polynucleotides according to the invention may also be inserted into the vectors described above in an antisense orientation in order to provide for the production of antisense RNA. Antisense RNA or other antisense polynucleotides or ligands may also be produced by synthetic means. Such antisense polynucleotides may be used in a method of controlling the levels of the proteins encoded by the ORFs of the invention in a mycobacterial cell.
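As a minimal sketch of the sequence relationship involved, the antisense RNA corresponding to a coding strand is its reverse complement written in RNA letters. The function names and the six-base example are hypothetical and do not correspond to any sequence of the invention; ambiguity codes other than A, C, G and T are not handled.

```python
# Translation table for Watson-Crick complements of the four DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_rna(coding_strand: str) -> str:
    """Return the RNA complementary to the mRNA copied from `coding_strand`."""
    return reverse_complement(coding_strand).upper().replace("T", "U")


# Hypothetical coding-strand fragment, for illustration only.
example_coding = "ATGGCC"
print(reverse_complement(example_coding))
print(antisense_rna(example_coding))
```

Inserting the coding sequence into a vector in the reverse orientation relative to the promoter has the same effect, since transcription then reads the opposite strand.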

Polynucleotides of the invention may also be carried by vectors suitable for gene therapy methods. Such gene therapy methods include those designed to provide vaccination against diseases caused by pathogenic mycobacteria or to boost the immune response of a human or animal infected with a pathogenic mycobacterium.

For example, Ziegner et al, AIDS, 1995, 9; 43-50 describes the use of a replication defective recombinant amphotropic retrovirus to boost the immune response in patients with HIV infection. Such a retrovirus may be modified to carry a polynucleotide encoding a polypeptide or fragment thereof of the invention and the retrovirus delivered to the cells of a human or animal subject in order to provide an immune response against said polypeptide. The retrovirus may be delivered directly to the patient or may be used to infect cells ex vivo, e.g. fibroblast cells, which are then introduced into the patient, optionally after being inactivated. The cells are desirably autologous or HLA-matched cells from the human or animal subject.

Gene therapy methods, including methods for boosting an immune response to a particular pathogen, are disclosed generally in, for example, WO95/14091, the disclosure of which is incorporated herein by reference. Recombinant viral vectors include retroviral vectors, adenoviral vectors, adeno-associated viral vectors, vaccinia virus vectors, herpes virus vectors and alphavirus vectors. Alphavirus vectors are described in, for example, WO95/07994, the disclosure of which is incorporated herein by reference.

Where direct administration of the recombinant viral vector is contemplated, either in the form of naked nucleic acid or in the form of packaged particles carrying the nucleic acid, this may be done by any suitable means, for example oral administration or intravenous injection. From 10⁵ to 10⁸ c.f.u. of virus represents a typical dose, which may be repeated for example weekly over a period of a few months. Administration of autologous or HLA-matched cells infected with the virus may be more convenient in some cases. This will generally be achieved by administering doses, for example from 10⁵ to 10⁸ cells per dose, which may be repeated as described above.

The recombinant viral vector may further comprise nucleic acid capable of expressing an accessory molecule of the immune system designed to increase the immune response. Such a molecule may be, for example, an interferon, particularly interferon gamma, an interleukin, for example IL-1α, IL-1β or IL-2, or an HLA class I or II molecule. This may be particularly desirable where the vector is intended for use in the treatment of humans or animals already infected with a mycobacterium and it is desired to boost the immune response.

E. Antibodies.

The invention also provides monoclonal or polyclonal antibodies to polypeptides of the invention or fragments thereof. The invention further provides a process for the production of monoclonal or polyclonal antibodies to polypeptides of the invention. Monoclonal antibodies may be prepared by conventional hybridoma technology using the polypeptides of the invention or peptide fragments thereof as immunogens. Polyclonal antibodies may also be prepared by conventional means which comprise inoculating a host animal, for example a rat or a rabbit, with a polypeptide of the invention or peptide fragment thereof and recovering immune serum.

In order that such antibodies may be made, the invention also provides polypeptides of the invention or fragments thereof haptenised to another polypeptide for use as immunogens in animals or humans.

For the purposes of this invention, the term “antibody”, unless specified to the contrary, includes fragments of whole antibodies which retain their binding activity for a polypeptide of the invention. Such fragments include Fv, F(ab′) and F(ab′)₂ fragments, as well as single chain antibodies. Furthermore, the antibodies and fragments thereof may be humanised antibodies, e.g. as described in EP-A-239400.

Antibodies may be used in methods of detecting polypeptides of the invention present in biological samples (where such samples include the human or animal body samples, and environmental samples, mentioned above) by a method which comprises:

-   (a) providing an antibody of the invention;
-   (b) incubating a biological sample with said antibody under conditions which allow for the formation of an antibody-antigen complex; and
-   (c) determining whether antibody-antigen complex comprising said antibody is formed.

Antibodies of the invention may be bound to a solid support, for example an immunoassay well, microparticle, dipstick or biosensor, and/or packaged into kits in a suitable container along with suitable reagents, controls, instructions and the like.

Antibodies of the invention may be used in the detection, diagnosis and prognosis of diseases as described above in relation to polypeptides of the invention.

F. Compositions.

The present invention also provides compositions comprising a polynucleotide or polypeptide of the invention together with a carrier or diluent. Compositions of the invention also include compositions comprising a nucleic acid, particularly an expression vector, of the invention. Compositions further include those carrying a recombinant virus of the invention. Such compositions include pharmaceutical compositions, in which case the carrier or diluent will be pharmaceutically acceptable.

Pharmaceutically acceptable carriers or diluents include those used in formulations suitable for inhalation as well as oral, parenteral (e.g. intramuscular or intravenous or transcutaneous) administration. The formulations may conveniently be presented in unit dosage form and may be prepared by any of the methods well known in the art of pharmacy. Such methods include the step of bringing into association the active ingredient with the carrier which constitutes one or more accessory ingredients. In general the formulations are prepared by uniformly and intimately bringing into association the active ingredient with liquid carriers or finely divided solid carriers or both, and then, if necessary, shaping the product.

For example, formulations suitable for parenteral administration include aqueous and non-aqueous sterile injection solutions which may contain anti-oxidants, buffers, bacteriostats and solutes which render the formulation isotonic with the blood of the intended recipient, and aqueous and non-aqueous sterile suspensions which may include suspending agents and thickening agents, and liposomes or other microparticulate systems which are designed to target the polynucleotide or the polypeptide of the invention to blood components or one or more organs, or to target cells such as M cells of the intestine after oral administration.

G. Vaccines.

In another aspect, the invention provides novel vaccines for the prevention and treatment of infections caused by Mptb, Mavs, other GS-containing pathogenic mycobacteria and Mtb in animals and humans. The term “vaccine” as used herein means an agent used to stimulate the immune system of a vertebrate, particularly a warm blooded vertebrate including humans, so as to provide protection against future harm by an organism to which the vaccine is directed or to assist in the eradication of an organism in the treatment of established infection. The immune system will be stimulated by the production of cellular immunity antibodies, desirably neutralizing antibodies, directed to epitopes found on or in a pathogenic mycobacterium which expresses any one of the ORFs of the invention. The antibody so produced may be any of the immunological classes, such as the immunoglobulins A, D, E, G or M. Vaccines which stimulate the production of IgA are of interest since this is the principal immunoglobulin produced by the secretory system of warm-blooded animals, and the production of such antibodies will help prevent infection or colonization of the intestinal tract. However an IgM and IgG response will also be desirable for systemic infections such as Crohn's disease or tuberculosis.

Vaccines of the invention include polynucleotides of the invention or fragments thereof in suitable vectors and administered by injection of naked DNA using standard protocols. Polynucleotides of the invention or fragments thereof in suitable vectors for the expression of the polypeptides of the invention may be given by injection, inhalation or by mouth. Suitable vectors include M. bovis BCG, M. smegmatis or other mycobacteria, Corynebacteria, Salmonella or other agents according to established protocols.

Polypeptides of the invention or fragments thereof in substantially isolated form may be used as vaccines by injection, inhalation, oral administration or by transcutaneous application according to standard protocols. Adjuvants (such as Iscoms or polylactide-coglycolide encapsulation), cytokines such as IL-12 and other immunomodulators may be used for the selective enhancement of the cell mediated or humoral immunological responses. Vaccination with polynucleotides and/or polypeptides of the invention may be undertaken to increase the susceptibility of pathogenic mycobacteria to antimicrobial agents in vivo.

In instances wherein the polypeptide is correctly configured so as to provide the correct epitope, but is too small to be immunogenic, the polypeptide may be linked to a suitable carrier.

A number of techniques for obtaining such linkage are known in the art, including the formation of disulfide linkages using N-succinimidyl-3-(2-pyridylthio) propionate (SPDP) and succinimidyl 4-(N-maleimido-methyl)cyclohexane-1-carboxylate (SMCC) obtained from Pierce Company, Rockford, Ill. (if the peptide lacks a sulfhydryl group, this can be provided by addition of a cysteine residue). These reagents create a disulfide linkage between themselves and peptide cysteine residues on one protein and an amide linkage through the epsilon-amino on a lysine, or other free amino group in the other. A variety of such disulfide/amide-forming agents are known. See, for example, Immun Rev (1982) 62:185. Other bifunctional coupling agents form a thioether rather than a disulfide linkage. Many of these thioether-forming agents are commercially available and include reactive esters of 6-maleimidocaproic acid, 2-bromoacetic acid, 2-iodoacetic acid, 4-(N-maleimido-methyl)cyclohexane-1-carboxylic acid, and the like. The carboxyl groups can be activated by combining them with succinimide or 1-hydroxyl-2-nitro-4-sulfonic acid, sodium salt. Additional methods of coupling antigens employ the rotavirus/“binding peptide” system described in EPO Pub. No. 259,149, the disclosure of which is incorporated herein by reference. The foregoing list is not meant to be exhaustive, and modifications of the named compounds can clearly be used. Any carrier may be used which does not itself induce the production of antibodies harmful to the host. Suitable carriers are typically large, slowly metabolized macromolecules such as proteins; polysaccharides, such as latex functionalized Sepharose®, agarose, cellulose, cellulose beads and the like; polymeric amino acids, such as polyglutamic acid, polylysine, polylactide-coglycolide and the like; amino acid copolymers; and inactive virus particles. Especially useful protein substrates are serum albumins, keyhole limpet hemocyanin, immunoglobulin molecules, thyroglobulin, ovalbumin, tetanus toxoid, and other proteins well known to those skilled in the art.

The immunogenicity of the epitopes may also be enhanced by preparing them in mammalian or yeast systems fused with or assembled with particle-forming proteins such as, for example, that associated with hepatitis B surface antigen. See, e.g., U.S. Pat. No. 4,722,840. Constructs wherein the epitope is linked directly to the particle-forming protein coding sequences produce hybrids which are immunogenic with respect to the epitope. In addition, all of the vectors prepared include epitopes specific to HBV, having various degrees of immunogenicity, such as, for example, the pre-S peptide.

In addition, portions of the particle-forming protein coding sequence may be replaced with codons encoding an epitope of the invention. In this replacement, regions which are not required to mediate the aggregation of the units to form immunogenic particles in yeast or mammals can be deleted, thus eliminating additional HBV antigenic sites from competition with the epitope of the invention.

Vaccines may be prepared from one or more immunogenic polypeptides of the invention. These polypeptides may be expressed in various host cells (e.g., bacteria, yeast, insect, or mammalian cells), or alternatively may be isolated from viral preparations or made synthetically.

In addition to the above, it is also possible to prepare live vaccines of attenuated microorganisms which express one or more recombinant polypeptides of the invention. Suitable attenuated microorganisms are known in the art and include, for example, viruses (e.g., vaccinia virus), as well as bacteria.

The preparation of vaccines which contain an immunogenic polypeptide(s) as active ingredients is known to one skilled in the art. Typically, such vaccines are prepared as injectables, or as suitably encapsulated oral preparations, and either liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid prior to ingestion or injection may also be prepared. The preparation may also be emulsified, or the protein encapsulated in liposomes. The active immunogenic ingredients are often mixed with excipients which are pharmaceutically acceptable and compatible with the active ingredient. Suitable excipients are, for example, water, saline, dextrose, glycerol, ethanol, or the like and combinations thereof. In addition, if desired, the vaccine may contain minor amounts of auxiliary substances such as wetting or emulsifying agents, pH buffering agents, and/or adjuvants which enhance the effectiveness of the vaccine. Examples of adjuvants which may be effective include but are not limited to: aluminum hydroxide, N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred to as nor-MDP), N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1′-2′-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (CGP 19835A, referred to as MTP-PE), and RIBI, which contains three components extracted from bacteria, monophosphoryl lipid A, trehalose dimycolate and cell wall skeleton (MPL+TDM+CWS) in a 2% squalene/TWEEN 80 (polyoxyethylene sorbitan monooleate) emulsion. The effectiveness of an adjuvant may be determined by measuring the amount of antibodies directed against an immunogenic polypeptide containing an antigenic sequence resulting from administration of this polypeptide in vaccines which are also comprised of the various adjuvants.

The vaccines are conventionally administered parenterally, by injection, for example, either subcutaneously or intramuscularly. Additional formulations which are suitable for other modes of administration include suppositories, oral formulations or enemas. For suppositories, traditional binders and carriers may include, for example, polyalkylene glycols or triglycerides; such suppositories may be formed from mixtures containing the active ingredient in the range of 0.5% to 10%, preferably 1%-2%. Oral formulations include such normally employed excipients as, for example, pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharine, cellulose, magnesium carbonate, and the like. These compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained release formulations or powders and contain 10%-95% of active ingredient, preferably 25%-70%.

The proteins may be formulated into the vaccine as neutral or salt forms. Pharmaceutically acceptable salts include the acid addition salts (formed with free amino groups of the peptide) which are formed with inorganic acids such as, for example, hydrochloric or phosphoric acids, or such organic acids as acetic, oxalic, tartaric, maleic, and the like. Salts formed with the free carboxyl groups may also be derived from inorganic bases such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides, and such organic bases as isopropylamine, trimethylamine, 2-ethylamino ethanol, histidine, procaine, and the like.

The vaccines are administered in a manner compatible with the dosage formulation, and in such amount as will be prophylactically and/or therapeutically effective. The quantity to be administered, which is generally in the range of 5 μg to 250 μg of antigen per dose, depends on the subject to be treated, capacity of the subject's immune system to synthesize antibodies, mode of administration and the degree of protection desired. Precise amounts of active ingredient required to be administered may depend on the judgement of the practitioner and may be peculiar to each subject.

The vaccine may be given in a single dose schedule, or preferably in a multiple dose schedule. A multiple dose schedule is one in which a primary course of vaccination may be with 1-10 separate doses, followed by other doses given at subsequent time intervals required to maintain and/or reinforce the immune response, for example, at 1-4 months for a second dose, and if needed, a subsequent dose(s) after several months. The dosage regimen will also, at least in part, be determined by the need of the individual and be dependent upon the judgement of the practitioner.

In a further aspect of the invention, there is provided an attenuated vaccine comprising a normally pathogenic mycobacterium which harbours an attenuating mutation in any one of the genes encoding a polypeptide of the invention. The gene is selected from the group of ORFs A, B, C, D, E, F, G and H, including the homologous ORFs B, C, E and F in Mtb.

The mycobacteria may be used in the form of killed bacteria or as a live attenuated vaccine. There are advantages to a live attenuated vaccine. The whole live organism is used, rather than dead cells or selected cell components which may exhibit modified or denatured antigens. Protein antigens in the outer membrane will maintain their tertiary and quaternary structures. Therefore the potential to elicit a good protective long term immunity should be higher.

The term “mutation” and the like refers to a genetic lesion in a gene which renders the gene non-functional. This may be at either the level of transcription or translation. The term thus envisages deletion of the entire gene or substantial portions thereof, and also point mutations in the coding sequence which result in truncated gene products unable to carry out the normal function of the gene.

A mutation introduced into a bacterium of the invention will generally be a non-reverting attenuating mutation. Non-reverting means that for practical purposes the probability of the mutated gene being restored to its normal function is small, for example less than 1 in 10⁶ such as less than 1 in 10⁹ or even less than 1 in 10¹².

An attenuated mycobacterium of the invention may be in isolated form. This is usually desirable when the bacterium is to be used for the purposes of vaccination. The term “isolated” means that the bacterium is in a form in which it can be cultured, processed or otherwise used in a form in which it can be readily identified and in which it is substantially uncontaminated by other bacterial strains, for example non-attenuated parent strains or unrelated bacterial strains. The term “isolated bacterium” thus encompasses cultures of a bacterial mutant of the invention, for example in the form of colonies on a solid medium or in the form of a liquid culture, as well as frozen or dried preparations of the strains.

In a preferred aspect, the attenuated mycobacterium further comprises at least one additional mutation. This may be a mutation in a gene responsible for the production of products essential to bacterial growth which are absent in a human or animal host. For example, mutations to the gene for aspartate semi-aldehyde dehydrogenase (asd) have been proposed for the production of attenuated strains of Salmonella. The asd gene is described further in Gene (1993) 129; 123-128. A lesion in the asd gene, encoding the enzyme aspartate β-semialdehyde dehydrogenase, would render the organism auxotrophic for the essential nutrient diaminopimelic acid (DAP), which can be provided exogenously during bulk culture of the vaccine strain. Since this compound is an essential constituent of the cell wall for gram-negative and some gram-positive organisms and is absent from mammalian or other vertebrate tissues, mutants would undergo lysis after about three rounds of division in such tissues. Analogous mutations may be made to the attenuated mycobacteria of the invention.

In addition or in the alternative, the attenuated mycobacteria may carry a recA mutation. The recA mutation knocks out homologous recombination, the process which is exploited for the construction of the mutations. Once the recA mutation has been incorporated the strain will be unable to repair the constructed deletion mutations. Such a mutation will provide attenuated strains in which the possibility of homologous recombination with DNA from wild-type strains has been minimized. RecA genes have been widely studied in the art and their sequences are available. Further modifications may be made for additional safety.

The invention further provides a process for preparing a vaccine composition comprising an attenuated bacterium according to the invention, which process comprises (a) inoculating a culture vessel containing a nutrient medium suitable for growth of said bacterium; (b) culturing said bacterium; (c) recovering said bacteria; and (d) mixing said bacteria with a pharmaceutically acceptable diluent or carrier.

Attenuated bacterial strains according to the invention may be constructed using recombinant DNA methodology which is known per se. In general, bacterial genes may be mutated by a process of targeted homologous recombination in which a DNA construct containing a mutated form of the gene is introduced into a host bacterium which it is desired to attenuate. The construct will recombine with the wild-type gene carried by the host and thus the mutated gene may be incorporated into the host genome to provide a bacterium of the present invention which may then be isolated.

The mutated gene may be obtained by introducing deletions into the gene, e.g. by digesting with a restriction enzyme which cuts the coding sequence twice to excise a portion of the gene and then religating under conditions in which the excised portion is not reintroduced into the cut gene. Alternatively, frame shift mutations may be introduced by cutting with a restriction enzyme which leaves overhanging 5′ and 3′ termini, filling in and/or trimming back the overhangs, and religating. Similar mutations may be made by site directed mutagenesis. These are only examples of the types of techniques which will readily be at the disposal of those of skill in the art.
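The two manipulations described above can be sketched at the sequence level as follows. This is a minimal illustration only, not a procedure taken from this specification: the function names and the toy sequences are hypothetical, EcoRI (G^AATTC, leaving a 4-base 5′ overhang) is assumed purely as a familiar example enzyme, and only the fill-in case (not trimming back) is modelled.

```python
def excise_between_sites(seq, site, cut_offset):
    """Delete the DNA between the first two occurrences of `site`.

    Models cutting twice with one enzyme and religating the outer ends,
    which regenerates a single copy of the recognition site.
    `cut_offset` is the number of bases into the site at which the
    enzyme cuts the top strand (1 for EcoRI, G^AATTC).
    """
    first = seq.find(site)
    second = seq.find(site, first + len(site)) if first >= 0 else -1
    if first < 0 or second < 0:
        raise ValueError("recognition site must occur at least twice")
    return seq[:first + cut_offset] + seq[second + cut_offset:]


def fill_in_and_religate(seq, site, cut_offset, overhang_len):
    """Cut once, fill in the 5' overhang, and blunt-end religate.

    Filling in duplicates the overhang bases, so the product is longer
    by `overhang_len` bases; if that is not a multiple of three the
    reading frame downstream of the site is shifted.
    """
    pos = seq.find(site)
    if pos < 0:
        raise ValueError("recognition site not found")
    cut = pos + cut_offset
    return seq[:cut + overhang_len] + seq[cut:]


# Hypothetical toy sequences, not taken from GS.
deleted = excise_between_sites("ATGAAAGAATTCCCCGGGTTTGAATTCAAATAA", "GAATTC", 1)
shifted = fill_in_and_religate("ATGGAATTCTAA", "GAATTC", 1, 4)
print(deleted)
print(shifted)
```

In this sketch the deletion removes a multiple of three bases and so stays in frame, whereas the fill-in adds four bases and shifts the frame; which outcome is wanted would depend on the construct being made.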

Various assays are available to detect successful recombination. In the case of attenuations which mutate a target gene necessary for the production of an essential metabolite or catabolite compound, selection may be carried out by screening for bacteria unable to grow in the absence of such a compound. Bacteria may also be screened with antibodies or nucleic acids of the invention to determine the absence of production of a mutated gene product of the invention or to confirm that the genetic lesion introduced (e.g. a deletion) has been incorporated into the genome of the attenuated strain.

The concentration of the attenuated strain in the vaccine will be formulated to allow convenient unit dosage forms to be prepared. Concentrations of from about 10⁴ to 10⁹ bacteria per ml will generally be suitable, e.g. from about 10⁵ to 10⁸ such as about 10⁶ per ml. Live attenuated organisms may be administered subcutaneously or intramuscularly at up to 10⁸ organisms in one or more doses, e.g. from around 10⁵ to 10⁸, e.g. about 10⁶ or 10⁷ organisms in a single dose.
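The relationship between the stated concentrations and doses is simple arithmetic, which can be sketched as below. The function names are hypothetical, and the range limits are just the figures quoted in the preceding paragraph; this is an illustration of the calculation, not dosing guidance.

```python
# Ranges quoted in the text above (organisms per ml, organisms per dose).
CONCENTRATION_RANGE = (1e4, 1e9)
DOSE_RANGE = (1e5, 1e8)


def dose_volume_ml(organisms_per_dose, organisms_per_ml):
    """Volume of suspension (ml) that contains the requested dose."""
    if organisms_per_dose <= 0 or organisms_per_ml <= 0:
        raise ValueError("dose and concentration must be positive")
    return organisms_per_dose / organisms_per_ml


def within_stated_ranges(organisms_per_dose, organisms_per_ml):
    """True if both values fall inside the ranges quoted in the text."""
    lo_c, hi_c = CONCENTRATION_RANGE
    lo_d, hi_d = DOSE_RANGE
    return lo_c <= organisms_per_ml <= hi_c and lo_d <= organisms_per_dose <= hi_d


# Example: a dose of 10^7 organisms drawn from a 10^8 per ml suspension.
print(dose_volume_ml(1e7, 1e8), "ml")
```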

The vaccines of the invention may be administered to recipients to treat established disease or in order to protect them against diseases caused by the corresponding wild type mycobacteria, such as inflammatory diseases such as Crohn's disease or sarcoidosis in humans or Johne's disease in animals. The vaccine may be administered by any suitable route. In general, subcutaneous or intramuscular injection is most convenient, but oral, intranasal and colorectal administration may also be used.

The following Examples illustrate aspects of the invention.

EXAMPLE 1

Tests for the presence of the GS identifier sequence were performed on 5 μl bacterial DNA extracts (25 μg/ml to 500 μg/ml) using polymerase chain reaction based on the oligonucleotide primers 5′-GATGCCGTGAGGAGGTAAAGCTGC-3′ (Seq ID No. 40) and 5′-GATACGGCTCTTGAATCCTGCACG-3′ (Seq ID No. 41) from within the identifier DNA sequences (Seq. ID Nos 1 and 2). PCR was performed for 40 cycles in the presence of 1.5 mM magnesium and an annealing temperature of 58° C. The presence or absence of the correct amplification product indicated the presence or absence of GS identifier sequence in the corresponding bacterium. GS identifier sequence is shown to be present in all the laboratory and field strains of Mptb and Mavs tested. This includes Mptb isolates 0025 (bovine CVL Weybridge), 0021 (caprine, Moredun), 0022 (bovine, Moredun), 0139 (human, Chiodini 1984), 0209, 0208, 0211, 0210, 0212, 0207, 0204, 0206 (bovine, Whipple 1990). All Mptb strains were IS900 positive. The Mavs strains include 0010 and 0012 (woodpigeon, Thorel), 0018 (armadillo, Portaels) and 0034, 0037, 0038, 0040 (AIDS, Hoffner). All Mavs strains were IS902 positive. One pathogenic M. avium strain 0033 (AIDS, Hoffner) also contained GS identifier sequence. GS identifier sequence is absent from other mycobacteria including other M. avium, M. malmoense, M. szulgai, M. gordonae, M. chelonei, M. fortuitum, M. phlei, as well as E. coli, S. aureus, Nocardia sp., Streptococcus sp., Shigella sp. and Pseudomonas sp.
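The logic of the PCR test in Example 1, where a product is obtained only if both primer sites are present in the template, can be sketched in silico as below. The two primer sequences are those given above; everything else is an assumption for illustration: the function names are hypothetical, matching is by exact string identity (real priming tolerates some mismatch and depends on the reaction conditions), and the demonstration template is a made-up toy sequence, not GS.

```python
FWD = "GATGCCGTGAGGAGGTAAAGCTGC"  # Seq ID No. 40
REV = "GATACGGCTCTTGAATCCTGCACG"  # Seq ID No. 41

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def in_silico_pcr(template, fwd, rev):
    """Return the predicted amplicon, or None if no product is expected.

    The forward primer must match one strand and the reverse primer must
    match the opposite strand downstream of it. Both orientations of the
    template are tried.
    """
    rev_site = reverse_complement(rev)
    for strand in (template, reverse_complement(template)):
        start = strand.find(fwd)
        if start < 0:
            continue
        end = strand.find(rev_site, start + len(fwd))
        if end >= 0:
            return strand[start:end + len(rev_site)]
    return None


# Hypothetical toy template carrying both primer sites.
toy = "TTT" + FWD + "ACGTACGT" + reverse_complement(REV) + "GGG"
product = in_silico_pcr(toy, FWD, REV)
print("product length:", len(product) if product else None)
```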

EXAMPLE 2

To obtain the full sequence of GS in Mavs and Mptb we generated a genomic library of Mavs using the restriction endonuclease EcoRI and cloning into the vector pUC18. This achieved a representative library which was screened with ³²P-labelled identifier sequence yielding a positive clone containing a 17 kbp insert. We constructed a restriction map of this insert and identified GS as fragments unique to Mavs and Mptb and not occurring in laboratory strains of M. avium. These fragments were sub-cloned into pUC18 and pGEM4Z. We identified GS contained within an 8 kb region. The full nucleotide sequence was determined for GS on both DNA strands using primer walking and automated DNA sequencing. DNA sequence for GS in Mptb was obtained using overlapping PCR products generated using Pwo DNA polymerase, a proofreading thermostable enzyme. The final DNA sequences were derived using the University of Wisconsin GCG gel assembly software package.

EXAMPLE 3

The DNA sequence of GS in Mavs and Mptb was found to be more than 99% homologous. The ORFs encoded in GS were identified using GeneRunner and DNAStar computer programmes. Eight ORFs were identified and designated GSA, GSB, GSC, GSD, GSE, GSF, GSG and GSH. Database comparisons were carried out against the GenEMBL Database release version 48.0 (9/96), using the BLAST and BLIXEM programmes. GSA and GSB encoded proteins of 13.5 kDa and 30.7 kDa respectively, both of unknown functions. GSC encoded a protein of 38.4 kDa with a 65% homology to the amino acid sequence of rfbD of V. cholerae, a 62% amino acid sequence homology to gmd of E. coli and a 58% homology to gca of Ps. aeruginosa which are all GDP-D-mannose dehydratases. Equivalent gene products in H. influenzae, S. dysenteriae, Y. enterocolitica, N. gonorrhoeae, K. pneumoniae and rfbD in Salmonella enterica are all involved in ‘O’-antigen processing known to be linked to pathogenicity. GSD encoded a protein of 37.1 kDa which showed 58% homology at the DNA level to wcaG from E. coli, a gene involved in the synthesis and regulation of capsular polysaccharides, also related to pathogenicity. GSE was found to have a >30% amino acid homology to rfbT of V. cholerae, involved in the transport of specific LPS components across the cell membrane. In V. cholerae the gene product causes a seroconversion from the Inaba to the Ogawa ‘epidemic’ strain. GSF encoded a protein of 30.2 kDa which was homologous in the range 25-40% at the amino acid level to several glucosyl transferases such as rfpA of K. pneumoniae, rfbB of K. pneumoniae, lgtD of H. influenzae and lsi of N. gonorrhoeae. In E. coli an equivalent gene galE adds β-1-3 N-acetylglucosamine to galactose, the latter only found in ‘O’ and ‘M’ antigens which are also related to pathogenicity. GSH comprising the ORFs GSH₁ and GSH₂ encodes a protein totalling about 60 kDa which is a putative transposase with a 40-43% homology at the amino acid level to the equivalent gene product of IS21 in E. coli. This family of insertion sequences is broadly distributed amongst gram negative bacteria and is responsible for mobility and transposition of genetic elements. An IS21-like element in B. fragilis is split either side of the β-lactamase gene controlling its activation and expression. We programmed an E. coli S30 cell-free extract with plasmid DNA containing the ORF GSH under the control of a lac promoter in the presence of ³⁵S-methionine, and demonstrated the translation of an abundant 60 kDa protein. The proteins homologous to GS encoded in other organisms are in general highly antigenic. Thus the proteins encoded by the ORFs in GS may be used in immunoassays of antibody or cell mediated immuno-reactivity for diagnosing infections caused by mycobacteria, particularly Mptb, Mavs and Mtb. Enhancement of host immune recognition of GS encoded proteins by vaccination using naked specific DNA or recombinant GS proteins may be used in the prevention and treatment of infections caused by Mptb, Mavs and Mtb in humans and animals. Mutation or deletion of all or some of the ORFs A to H in GS may be used to generate attenuated strains of Mptb, Mavs or Mtb with lower pathogenicity for use as living or killed vaccines in humans and animals. Such vaccines are particularly relevant to Johne's disease in animals, to diseases caused by Mptb in humans such as Crohn's disease, and to the management of tuberculosis especially where the disease is caused by multiple drug-resistant organisms.
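The ORF identification step of Example 3 was carried out with GeneRunner and DNAStar; how those programmes work internally is not described here. Purely to illustrate the general idea of scanning all six reading frames for open reading frames, a minimal sketch follows. It makes several simplifying assumptions that are not taken from this specification: only ATG is treated as a start codon (mycobacteria also use alternative starts such as GTG), the function names are hypothetical, and the test sequence is a toy example, not GS.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Scan six reading frames for ATG...stop open reading frames.

    Returns tuples (strand, start, end, orf_sequence). Coordinates are
    0-based, end-exclusive, and refer to the strand being scanned.
    ORFs shorter than `min_codons` (stop codon included) are skipped.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None
    return orfs


# Hypothetical toy sequence containing one short ORF on the forward strand.
for orf in find_orfs("CCATGAAATTTGGGTAACC"):
    print(orf)
```

On a real sequence of several kilobases a much larger `min_codons` threshold would normally be chosen, so that only ORFs long enough to encode plausible proteins are reported.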

1. A vector carrying a polynucleotide comprising: (a) a polynucleotide encoding the polypeptide of SEQ ID NO:24; or (b) a polynucleotide that encodes a polypeptide having at least 90% sequence identity to the polypeptide of SEQ ID NO:24.
2. A vector according to claim 1 wherein said polynucleotide comprises the polynucleotide of SEQ ID NO:23.
3. A vector according to claim 1, which is an expression vector.
4. A vector according to claim 3, wherein said polynucleotide is operably linked to a control sequence which is capable of providing for the expression of the coding sequence of the polynucleotide.
5. A vector according to claim 1 which comprises one or more components selected from the group consisting of an origin of replication, a promoter for expression of the polypeptide encoded by said polynucleotide, a regulator of a promoter for expression of the polypeptide encoded by said polynucleotide, an enhancer and a selectable marker gene.
6. A vector according to claim 5, wherein said promoter is a mammalian, viral, yeast or bacterial promoter.
7. A vector according to claim 6, wherein said promoter is selected from the group consisting of: a metallothionein promoter, an adenovirus promoter, the SV40 large T promoter, a retroviral LTR promoter, the polyhedrin promoter, an alcohol dehydrogenase promoter and a β-galactosidase promoter.
8. A vector according to claim 1, which is adapted for use in vivo.
9. A vector according to claim 1, which is a plasmid, virus or phage vector.
10. A vector according to claim 9, wherein said viral vector is selected from the group consisting of retroviral vectors, adenoviral vectors, adeno-associated viral vectors, vaccinia virus vectors, herpes virus vectors and alpha virus vectors.
11. A host cell comprising, transformed with or transfected by a vector according to claim 1.
12. A host cell according to claim 11, which is a bacterial, yeast, insect or mammalian cell.
13. A host cell according to claim 12 which is selected from the group consisting of M. bovis BCG, M. smegmatis, a mycobacterium, Corynebacteria and Salmonella.
14. A pharmaceutical composition comprising: (a) a polynucleotide encoding the polypeptide of SEQ ID NO:24; (b) a polynucleotide that encodes a polypeptide having at least 90% sequence identity to the polypeptide of SEQ ID NO:24; (c) a vector according to claim 1; or (d) a host cell containing said vector; and a pharmaceutically acceptable carrier or diluent.
15. A composition according to claim 14 wherein said polynucleotide comprises the polynucleotide of SEQ ID NO:23.